Visualizing genome and systems biology: technologies, tools, implementation techniques and trends, past, present and future.

Author: Enright Anton J
Iliopoulos Ioannis
Malliarakis Dimitris
Papanikolaou Nikolas
Pavlopoulos Georgios A
Theodosiou Theodosis
Publication venue: Gigascience
Publication date: 01/01/2015

"Α picture is worth a thousand words." This widely used adage sums up in a few words the notion that a successful visual representation of a concept should enable easy and rapid absorption of large amounts of information. Although, in general, the notion of capturing complex ideas using images is very appealing, would 1000 words be enough to describe the unknown in a research field such as the life sciences? Life sciences is one of the biggest generators of enormous datasets, mainly as a result of recent and rapid technological advances; their complexity can make these datasets incomprehensible without effective visualization methods. Here we discuss the past, present and future of genomic and systems biology visualization. We briefly comment on many visualization and analysis tools and the purposes that they serve. We focus on the latest libraries and programming languages that enable more effective, efficient and faster approaches for visualizing biological concepts, and also comment on the future human-computer interaction trends that would enable for enhancing visualization further

Apollo (Cambridge)

Arena3D: visualization of biological networks in 3D

Author: O'Donoghue Seán I
Pafilis Evangelos
Pavlopoulos Georgios A
Satagopam Venkata P
Schneider Reinhard
Soldatos Theodoros G
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Complexity is a key problem when visualizing biological networks; as the number of entities increases, most graphical views become incomprehensible. Our goal is to enable many thousands of entities to be visualized meaningfully and with high performance. Results: We present a new visualization tool, Arena3D, which introduces a new concept of staggered layers in 3D space. Related data (such as proteins, chemicals, or pathways) can be grouped onto separate layers and arranged via layout algorithms, such as Fruchterman-Reingold, distance geometry, and a novel hierarchical layout. Data on a layer can be clustered via k-means, affinity propagation, Markov clustering, neighbor joining, tree clustering, or UPGMA ('unweighted pair-group method with arithmetic mean'). A simple input format defines the name and URL for each node, and defines connections or similarity scores between pairs of nodes. The use of Arena3D is illustrated with datasets related to Huntington's disease. Conclusion: Arena3D is a user-friendly visualization tool that is able to visualize biological or any other network in 3D space. It is free for academic use and runs on any platform. It can be downloaded or launched directly from http://arena3d.org. The Java3D library and Java 1.5 need to be pre-installed for the software to run.
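
The abstract above describes nodes that carry a name and a URL, sit on a layer, and are linked by connections or similarity scores. A minimal sketch of such a structure follows. It is not Arena3D's actual input format or code: the class names, field names, and method names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    url: str
    layer: str  # e.g. "proteins", "chemicals", "pathways"


@dataclass
class LayeredNetwork:
    """Hypothetical container for a layered network (not Arena3D's format)."""
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: list = field(default_factory=list)  # (name_a, name_b, score)

    def add_node(self, name, url, layer):
        self.nodes[name] = Node(name, url, layer)

    def connect(self, a, b, score=1.0):
        # Both endpoints must already be defined as nodes.
        for name in (a, b):
            if name not in self.nodes:
                raise KeyError(f"unknown node: {name}")
        self.edges.append((a, b, score))

    def inter_layer_edges(self):
        # Edges whose endpoints sit on different layers.
        return [(a, b, s) for a, b, s in self.edges
                if self.nodes[a].layer != self.nodes[b].layer]
```

Keeping the layer as a node attribute makes it straightforward to separate within-layer edges from the between-layer links that a staggered-layer display would emphasize.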

UNSWorks

Corrigendum: A guide to conquer the biological network era using graph theory

Author: David Paez-Espino
Evangelos Karatzas
Georgios A. Pavlopoulos
Mikaela Koutrouli
Publication venue: Frontiers Media SA
Publication date: 01/03/2023

A reference guide for tree analysis and visualization

Author: Barbosa Da Silva Adriano
Pavlopoulos Georgios A.
Schneider Reinhard
Soldatos Theodoros G.
Publication date: 01/01/2010

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies are becoming cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized are expanding rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data, since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools cannot display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization and analysis of biological trees, mainly developed during the last decade. We describe the uniform and standard computer-readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and integrated with various data sources. Here we focus on freely available software that offers users various tree-representation methodologies for biological data analysis.

MDC Repository

Empirical Comparison of Visualization Tools for Larger-Scale Network Analysis

Author: David Paez-Espino
Georgios A. Pavlopoulos
Ioannis Iliopoulos
Nikos C. Kyrpides
Publication venue: Hindawi Limited

LAITOR - Literature Assistant for Identification of Terms co-Occurrences and Relationships

Author: Andrade-Navarro Miguel A
Barbosa-Silva Adriano
Fontaine Jean-Fred
Magalhães Ivan LF
Ortega J Miguel
Pavlopoulos Georgios A
Schneider Reinhard
Soldatos Theodoros G
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Biological knowledge is represented in scientific literature that often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Therefore, the study of the current literature about a selected topic deposited in public databases facilitates the generation of novel hypotheses associating a set of bioentities with a common context. Results: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. Conclusions: Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.
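
The idea of counting terms that co-occur within the same sentence of an abstract can be sketched as follows. This is an illustrative simplification and not LAITOR's implementation: the function name, the naive sentence splitter, the whole-word matching, and the placeholder term names are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_cooccurrences(abstracts, terms):
    """Count how often each pair of terms appears in the same sentence."""
    counts = Counter()
    lowered = sorted({t.lower() for t in terms})
    for abstract in abstracts:
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            words = set(re.findall(r"[a-z0-9\-]+", sentence.lower()))
            present = [t for t in lowered if t in words]
            # Each unordered pair found in this sentence counts once.
            for a, b in combinations(present, 2):
                counts[(a, b)] += 1
    return counts


# Placeholder abstracts and term names, for illustration only.
abstracts = [
    "geneA interacts with geneB. geneC is a receptor.",
    "geneB binds geneA and geneC.",
]
pairs = sentence_cooccurrences(abstracts, ["geneA", "geneB", "geneC"])
```

Counting per sentence rather than per abstract is one simple way to take term position into account: two terms mentioned in different sentences of the same abstract do not contribute to the pair count.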

ReLiance: a machine learning and literature-based prioritization of receptor-ligand pairings.

Author: De Moor B.
Iacucci Ernesto
Moreau Y.
Pavlopoulos Georgios A.
Popovic D.
Schneider Reinhard
Tranchevent L. C.
Publication date: 01/01/2012

Motivation: The prediction of receptor-ligand pairings is an important area of research as intercellular communications are mediated by the successful interaction of these key proteins. As the exhaustive assaying of receptor-ligand pairs is impractical, a computational approach to predict pairings is necessary. We propose a workflow to carry out this interaction prediction task, using a text mining approach in conjunction with a state-of-the-art prediction method, as well as a widely accessible and comprehensive dataset. Among several modern classifiers, random forests have been found to be the best at this prediction task. The training of this classifier was carried out using an experimentally validated dataset of Database of Ligand-Receptor Partners (DLRP) receptor-ligand pairs. New examples, co-cited with the training receptors and ligands, are then classified using the trained classifier. After applying our method, we find that we are able to successfully predict receptor-ligand pairs within the GPCR family with a balanced accuracy of 0.96. Upon further inspection, we find several supported interactions that were not present in the Database of Interacting Proteins (DIPdatabase). We have measured the balanced accuracy of our method, resulting in high-quality predictions stored in the available database ReLiance. Availability: http://homes.esat.kuleuven.be/?bioiuser/ReLianceDB/index.php Contact: [email protected]; ernesto.iacucci@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
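
Balanced accuracy, the metric reported in the abstract above, is conventionally defined for binary classification as the mean of sensitivity and specificity. The sketch below shows that standard definition only. It is not the authors' evaluation code, and the function name and toy labels are assumptions made for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (TPR) and specificity (TNR) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy labels: one of two positives is missed, both negatives are correct.
score = balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0])  # 0.75
```

Unlike plain accuracy, this measure is not inflated when one class heavily outnumbers the other, which matters when true pairs are rare among all candidate pairs.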

UniProt-Related Documents (UniReD): assisting wet lab biologists in their quest on finding novel counterparts in a protein network.

Author: Aivaliotis Michalis
Amoutzias Grigoris D
Bonetto Giulia
Eliopoulos Aristides G
Fakoureli Eirini
Iliopoulos Ioannis
Karagogeos Domna
Maxouri Stella
Nikoletopoulou Vasiliki
Papanikolaou Nikolaos
Pavlopoulos Georgios A
Savvaki Maria
Tavernarakis Nektarios
Theodosiou Theodosios
Tzamarias Dimitris
Publication venue: NAR Genom Bioinform
Publication date: 01/03/2020

The in-depth study of protein-protein interactions (PPIs) is of key importance for understanding how cells operate. Therefore, in the past few years, many experimental as well as computational approaches have been developed for the identification and discovery of such interactions. Here, we present UniReD, a user-friendly computational prediction tool which analyses biomedical literature in order to extract known protein associations and suggest undocumented ones. As a proof of concept, we demonstrate its usefulness by experimentally validating six predicted interactions and by benchmarking it against public databases of experimentally validated PPIs, achieving high coverage. We believe that UniReD can become an important and intuitive resource for experimental biologists in their quest for finding novel associations within a protein network and a useful tool to complement experimental approaches (e.g. mass spectrometry) by producing sorted lists of candidate proteins for further experimental validation. UniReD is available at http://bioinformatics.med.uoc.gr/unired/

Apollo (Cambridge)

Arena3D: visualizing time-driven phenotypic differences in biological systems

Author: A Agresti
A Marnef
APL Torkamani
B Neumann
B Schwanhäusser
BD MacArthur
E Masry
E Wilson
G Pavlopoulos
Georgios A Pavlopoulos
GM Crippen
Jan Aerts
JH Morris
K Tanaka
M Bayona-Bafaluy
M Cerone
M Costanzo
M Henderson
M Meyer
MA Westenberg
Maria Secrier
MG Kendall
P Flicek
R Bourqui
R Kincaid
R Lu
Reinhard Schneider
S Kruglyak
TJ Lopes
V Gache
W Wu
W Yang
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Elucidating the genotype-phenotype connection is one of the big challenges of modern molecular biology. To fully understand this connection, it is necessary to consider the underlying networks and the time factor. In this context of data deluge and heterogeneous information, visualization plays an essential role in interpreting complex and dynamic topologies. Thus, software that is able to bring the network, phenotypic and temporal information together is needed. Arena3D has been previously introduced as a tool that facilitates link discovery between processes. It uses a layered display to separate different levels of information while emphasizing the connections between them. We present novel developments of the tool for the visualization and analysis of dynamic genotype-phenotype landscapes. Results: Version 2.0 introduces novel features that allow handling time course data in a phenotypic context. Gene expression levels or other measures can be loaded and visualized at different time points, and phenotypic comparison is facilitated through clustering and correlation display or highlighting of impacting changes through time. Similarity scoring allows the identification of global patterns in dynamic heterogeneous data. In this paper we demonstrate the utility of the tool on two distinct biological problems of different scales. First, we analyze a medium-scale dataset that looks at perturbation effects of the pluripotency regulator Nanog in murine embryonic stem cells. Dynamic cluster analysis suggests alternative indirect links between Nanog and other proteins in the core stem cell network. Moreover, recurrent correlations from the epigenetic to the translational level are identified. Second, we investigate a large-scale dataset consisting of genome-wide knockdown screens for human genes essential in the mitotic process. Here, a potential new role for the gene lsm14a in cytokinesis is suggested. We also show how phenotypic patterning allows for extensive comparison and identification of high-impact knockdown targets. Conclusions: We present a new visualization approach for perturbation screens with multiple phenotypic outcomes. The novel functionality implemented in Arena3D enables effective understanding and comparison of temporal patterns within morphological layers, to help with the system-wide analysis of dynamic processes. Arena3D is available free of charge for academics as a downloadable standalone application from: http://arena3d.org/.

Springer